AIbase

# Chinese Visual Question Answering

Qwen2.5 VL 7B Instruct GGUF
Apache-2.0
Qwen2.5-VL-7B-Instruct is a multimodal vision-language model that supports image-to-text generation tasks.
Image-to-Text, English
samgreen
5,052
9
Aria Sequential Mlp Bnb Nf4
Apache-2.0
A BitsAndBytes NF4-quantized version of Aria-sequential_mlp for image-to-text tasks. It requires approximately 15.5 GB of VRAM.
Image-to-Text, Transformers
leon-se
76
11
Vit Gpt2 Image Chinese Captioning
MIT
This model uses a ViT image encoder and a GPT-2 decoder to generate Chinese image captions.
Image-to-Text, Transformers, Chinese
yuanzhoulvpi
22
6
© 2025 AIbase